
    Understanding Convolutional Neural Networks in Terms of Category-Level Attributes

    Abstract. It has recently been reported that convolutional neural networks (CNNs) perform well in many image recognition tasks. They significantly outperform previous approaches that are not based on neural networks, particularly for object category recognition. These results are arguably owing to their ability to discover better image features for recognition tasks through learning, resulting in the acquisition of better internal representations of the inputs. However, despite this success, it remains an open question why CNNs work so well and how they can learn such good representations. In this study, we conjecture that the learned representations can be interpreted as category-level attributes with good properties. We conducted several experiments using the AwA (Animals with Attributes) dataset and a CNN trained for ILSVRC-2012 in a fully supervised setting to examine this conjecture. We report that there exist units in the CNN that can predict some of the 85 semantic attributes fairly accurately, along with the detailed observation that this holds only for visual attributes and not for non-visual ones. It is natural to think that the CNN may discover not only semantic attributes but also non-semantic ones (or ones that are difficult to express in words). To explore this possibility, we perform zero-shot learning by regarding the activation patterns of upper layers as attributes describing the categories. The results show that this approach outperforms the state-of-the-art by a significant margin.
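    The zero-shot step can be pictured as nearest-neighbor matching between activation-derived descriptions of test images and per-class attribute signatures. A minimal sketch, with random stand-ins for the CNN activations and the AwA class signatures:

    ```python
    # Minimal sketch of zero-shot classification from CNN activations treated
    # as attributes, in the spirit of the paper. All data here is synthetic;
    # in the paper the activations come from an ILSVRC-2012-trained CNN and
    # the class signatures from the AwA annotations.
    import numpy as np

    rng = np.random.default_rng(0)

    n_unseen, n_units = 10, 4096                        # unseen classes, upper-layer units
    class_signatures = rng.random((n_unseen, n_units))  # mean activation per class
    test_activations = rng.random((50, n_units))        # activations of test images

    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    # Cosine similarity between each test image and each class signature.
    scores = normalize(test_activations) @ normalize(class_signatures).T
    predictions = scores.argmax(axis=1)                 # nearest class signature wins
    print(predictions[:10])
    ```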

    Semantic bottleneck for computer vision tasks

    This paper introduces a novel method for the representation of images that is semantic by nature, addressing the question of computational intelligibility in computer vision tasks. More specifically, we propose to introduce what we call a semantic bottleneck into the processing pipeline: a crossing point at which the representation of the image is entirely expressed in natural language, while retaining the efficiency of numerical representations. We show that our approach is able to generate semantic representations that give state-of-the-art results on semantic content-based image retrieval and also perform very well on image classification tasks. Intelligibility is evaluated through user-centered experiments on failure detection.
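    To make the bottleneck idea concrete, here is an illustrative sketch (not the paper's method) in which each image is reduced to a set of natural-language tags and retrieval runs purely on that textual representation; the tags and the Jaccard similarity are stand-in choices:

    ```python
    # Illustrative sketch of retrieval through a "semantic bottleneck": every
    # image is represented only by natural-language tags (hand-written here;
    # a trained tagging model would produce them), and retrieval is done
    # entirely on the textual side.
    database = {
        "img_001": {"dog", "grass", "ball", "running"},
        "img_002": {"cat", "sofa", "indoor"},
        "img_003": {"dog", "beach", "water"},
    }

    def jaccard(a, b):
        return len(a & b) / len(a | b)

    query = {"dog", "ball"}
    ranked = sorted(database, key=lambda k: jaccard(query, database[k]), reverse=True)
    print(ranked)  # images whose textual description best matches the query
    ```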

    Learning the Roots of Visual Domain Shift

    In this paper we focus on the spatial nature of visual domain shift, attempting to learn where domain shift originates in each given image of the source and target sets. We borrow concepts and techniques from the CNN visualization literature and learn domainness maps able to localize the degree of domain specificity in images. From these maps we derive features related to different domainness levels, and we show that using them as a preprocessing step for a domain adaptation algorithm strongly improves the final classification performance. Combined with the whole-image representation, these features provide state-of-the-art results on the Office dataset.

    Comment: Extended Abstract
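    A domainness map can be sketched as patch-wise scoring with a source-vs-target domain classifier; the scoring function below is a stand-in for the visualization-derived classifier the paper actually uses:

    ```python
    # Minimal sketch of a "domainness map": score every patch of an image
    # with a binary source-vs-target domain classifier and keep the
    # per-location scores as a coarse heatmap.
    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.random((64, 64))
    patch, stride = 16, 8

    def domain_score(p):                      # stand-in for a trained classifier
        return float(p.mean())                # higher = more domain-specific

    ys = range(0, image.shape[0] - patch + 1, stride)
    xs = range(0, image.shape[1] - patch + 1, stride)
    domainness = np.array([[domain_score(image[y:y+patch, x:x+patch])
                            for x in xs] for y in ys])
    print(domainness.shape)                   # map of domain specificity
    ```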

    Visualizing Convolutional Networks for MRI-based Diagnosis of Alzheimer's Disease

    Visualizing and interpreting convolutional neural networks (CNNs) is an important task for increasing trust in automatic medical decision making systems. In this study, we train a 3D CNN to detect Alzheimer's disease based on structural MRI scans of the brain. Then, we apply four different gradient-based and occlusion-based visualization methods that explain the network's classification decisions by highlighting relevant areas in the input image. We compare the methods qualitatively and quantitatively. We find that all four methods focus on brain regions known to be involved in Alzheimer's disease, such as the inferior and middle temporal gyrus. While the occlusion-based methods focus more on specific regions, the gradient-based methods pick up distributed relevance patterns. Additionally, we find that the distribution of relevance varies across patients, with some having a stronger focus on the temporal lobe, whereas for others more cortical areas are relevant. In summary, we show that applying different visualization methods is important to understand the decisions of a CNN, a step that is crucial to increase clinical impact and trust in computer-based decision support systems.

    Comment: MLCN 2018
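    Occlusion-based relevance, one of the visualization families compared here, can be sketched as follows (shown in 2D for brevity; the study uses 3D MRI volumes, and the model below is a toy stand-in):

    ```python
    # Minimal sketch of occlusion sensitivity: mask one region at a time and
    # record how much the class score drops. Large drops mark relevant regions.
    import numpy as np

    rng = np.random.default_rng(0)
    scan = rng.random((32, 32))

    def model_score(x):                       # stand-in for the trained 3D CNN
        return float(x[8:24, 8:24].sum())     # pretend the centre drives the class

    baseline, patch = model_score(scan), 8
    relevance = np.zeros_like(scan)
    for y in range(0, scan.shape[0] - patch + 1, patch):
        for x in range(0, scan.shape[1] - patch + 1, patch):
            occluded = scan.copy()
            occluded[y:y+patch, x:x+patch] = 0.0        # occlude one region
            relevance[y:y+patch, x:x+patch] = baseline - model_score(occluded)
    print(relevance.max())
    ```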

    Hierarchical ResNeXt Models for Breast Cancer Histology Image Classification

    Microscopic histology image analysis is a cornerstone of early detection of breast cancer. However, these images are very large, and manual analysis is error-prone and very time-consuming, so automating the process is in high demand. We propose a hierarchical system of convolutional neural networks (CNNs) that automatically classifies patches of these images into four pathologies: normal, benign, in situ carcinoma, and invasive carcinoma. We evaluated our system on the BACH challenge dataset for image-wise classification and on a small dataset that we used to extend it. Using a 75%/25% train/test split, we achieved an accuracy of 0.99 on the test split of the BACH dataset and 0.96 on that of the extension. On the BACH challenge test set, we reached an accuracy of 0.81, which ranks us 8th out of 51 teams.
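    The general scheme of patch-wise classification with image-level aggregation can be sketched as a majority vote; the per-patch classifier below is a random stand-in for the trained ResNeXt models, and the actual system adds a hierarchy on top:

    ```python
    # Minimal sketch: classify every tile of a slide, then aggregate the
    # patch-level labels into one image-level label by majority vote.
    from collections import Counter
    import numpy as np

    CLASSES = ["normal", "benign", "in situ", "invasive"]
    rng = np.random.default_rng(0)

    def classify_patch(patch):                # stand-in for a trained ResNeXt
        return CLASSES[rng.integers(len(CLASSES))]

    patches = [rng.random((224, 224, 3)) for _ in range(12)]  # tiles of one image
    votes = Counter(classify_patch(p) for p in patches)
    image_label = votes.most_common(1)[0][0]
    print(votes, "->", image_label)
    ```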

    DeepAPT: Nation-State APT Attribution Using End-to-End Deep Neural Networks

    In recent years, numerous advanced malware samples, known as advanced persistent threats (APTs), have allegedly been developed by nation-states. The task of attributing an APT to a specific nation-state is extremely challenging for several reasons. Each nation-state usually has more than a single cyber unit that develops such advanced malware, rendering traditional authorship attribution algorithms useless. Furthermore, these APTs use state-of-the-art evasion techniques, making feature extraction challenging. Finally, the dataset of available APTs is extremely small. In this paper we describe how deep neural networks (DNNs) can be successfully employed for nation-state APT attribution. We use sandbox reports (recording the behavior of the APT when run dynamically) as raw input for the neural network, allowing the DNN to learn high-level feature abstractions of the APTs themselves. Using a test set of 1,000 Chinese- and Russian-developed APTs, we achieved an accuracy of 94.6%.
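    The pipeline shape (raw report text, word features, neural classifier over nation-states) can be sketched as below; the reports, labels, and the small MLP are all illustrative stand-ins for the paper's real sandbox reports and DNN:

    ```python
    # Minimal sketch: sandbox-report text -> bag-of-words features -> neural
    # classifier predicting the developing nation-state. All data invented.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.neural_network import MLPClassifier

    reports = [
        "createfile regkey mutex http beacon",
        "regkey persistence scheduled task http",
        "keylogger screenshot ftp exfil",
        "screenshot clipboard ftp exfil mutex",
    ]
    labels = ["state_a", "state_a", "state_b", "state_b"]

    X = CountVectorizer().fit_transform(reports)
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)
    clf.fit(X, labels)
    print(clf.predict(X[:1]))
    ```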

    VConv-DAE: Deep Volumetric Shape Learning Without Object Labels

    With the advent of affordable depth sensors, 3D capture is becoming more and more ubiquitous and has already made its way into commercial products. Yet capturing the geometry or complete shapes of everyday objects using scanning devices (e.g. Kinect) still comes with several challenges that result in noisy or even incomplete shapes. Recent success in deep learning has shown how to learn complex shape distributions in a data-driven way from large-scale 3D CAD model collections and to utilize them for 3D processing on volumetric representations, thereby circumventing problems of topology and tessellation. Prior work has shown encouraging results on problems ranging from shape completion to recognition. We provide an analysis of such approaches and discover that both the training and the resulting representation are strongly and unnecessarily tied to the notion of object labels. We therefore propose a fully convolutional volumetric autoencoder that learns volumetric representations from noisy data by estimating voxel occupancy grids. The proposed method outperforms prior work on challenging tasks like denoising and shape completion. We also show that the obtained deep embedding gives competitive performance when used for classification and promising results for shape interpolation.
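    The general construction, a denoising autoencoder over voxel occupancy grids, can be sketched in PyTorch; the layer sizes below are illustrative and not the paper's architecture:

    ```python
    # Minimal sketch of a volumetric denoising autoencoder: encode a corrupted
    # occupancy grid with 3D convolutions, decode back to occupancy logits.
    import torch
    import torch.nn as nn

    class VoxelDAE(nn.Module):
        def __init__(self):
            super().__init__()
            self.enc = nn.Sequential(
                nn.Conv3d(1, 8, 4, stride=2, padding=1), nn.ReLU(),   # 32 -> 16
                nn.Conv3d(8, 16, 4, stride=2, padding=1), nn.ReLU(),  # 16 -> 8
            )
            self.dec = nn.Sequential(
                nn.ConvTranspose3d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose3d(8, 1, 4, stride=2, padding=1),     # occupancy logits
            )

        def forward(self, x):
            return self.dec(self.enc(x))

    noisy = (torch.rand(2, 1, 32, 32, 32) > 0.5).float()  # corrupted occupancy grids
    model = VoxelDAE()
    logits = model(noisy)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, noisy)
    print(logits.shape, float(loss))
    ```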

    Part Detector Discovery in Deep Convolutional Neural Networks

    Current fine-grained classification approaches often rely on a robust localization of object parts to extract localized feature representations suitable for discrimination. However, part localization is a challenging task due to the large variation in appearance and pose. In this paper, we show how pre-trained convolutional neural networks can be used for robust and efficient object part discovery and localization without the need to actually train the network on the current dataset. Our approach, called part detector discovery (PDD), is based on analyzing the gradient maps of the network outputs and finding activation centers spatially related to annotated semantic parts or bounding boxes. This allows us not only to obtain excellent performance on the CUB200-2011 dataset, but also, in contrast to previous approaches, to perform detection and bird classification jointly without requiring a given bounding box annotation during testing or ground-truth parts during training. The code is available at http://www.inf-cv.uni-jena.de/part_discovery and https://github.com/cvjena/PartDetectorDisovery.

    Comment: Accepted for publication at the Asian Conference on Computer Vision (ACCV) 2014
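    The core idea, reading a part location off the gradient map of a single unit, can be sketched as follows; the tiny network below is a stand-in for a pre-trained CNN:

    ```python
    # Minimal sketch of part detector discovery's gradient-map idea: take the
    # gradient of one unit's output w.r.t. the input image and use the peak
    # of the gradient magnitude as a candidate part location.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    net = nn.Sequential(nn.Conv2d(3, 8, 5, padding=2), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1))
    image = torch.rand(1, 3, 64, 64, requires_grad=True)

    unit = net(image)[0, 3, 0, 0]            # response of one hidden unit
    unit.backward()
    grad_map = image.grad.abs().sum(dim=1)[0]  # aggregate over color channels
    y, x = divmod(int(grad_map.argmax()), grad_map.shape[1])
    print("candidate part location:", (y, x))  # peak of the gradient map
    ```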

    Right for the Right Reason: Training Agnostic Networks

    We consider the problem of a neural network being asked to classify images (or other inputs) without making implicit use of a "protected concept", that is, a concept that should not play any role in the decision of the network. Typically these concepts include information such as gender or race, or other contextual information such as image backgrounds that might be implicitly reflected in unknown correlations with other variables, making it insufficient to simply remove them from the input features. In other words, making accurate predictions is not good enough if those predictions rely on information that should not be used: predictive performance is not the only important metric for learning systems. We apply a method developed in the context of domain adaptation to address this problem of "being right for the right reason", where we require a classifier to make its decision in a way that is entirely agnostic to a given protected concept (e.g. gender, race, background), even if this concept could be implicitly reflected in other attributes via unknown correlations. After defining the concept of an "agnostic model", we demonstrate how the Domain-Adversarial Neural Network can remove unwanted information from a model using a gradient reversal layer.

    Comment: Author's original version
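    The gradient reversal layer borrowed from Domain-Adversarial Neural Networks is simple to state: identity on the forward pass, sign-flipped gradient on the backward pass, so the features are trained to be uninformative about the protected concept. A minimal PyTorch sketch:

    ```python
    # Minimal sketch of a gradient reversal layer: forward is the identity,
    # backward multiplies the incoming gradient by -lambda.
    import torch

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, lam=1.0):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_out):
            return -ctx.lam * grad_out, None   # reversed gradient to the features

    features = torch.rand(4, 8, requires_grad=True)
    reversed_feats = GradReverse.apply(features, 1.0)
    loss = reversed_feats.sum()                # stand-in for the concept classifier loss
    loss.backward()
    print(features.grad[0, :4])                # gradients arrive sign-flipped
    ```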

    Food Recognition using Fusion of Classifiers based on CNNs

    With the arrival of convolutional neural networks, the complex problem of food recognition has seen important improvements in recent years. The best results have been obtained with methods based on very deep convolutional neural networks, which show that the deeper the model, the better the classification accuracy. However, very deep neural networks may suffer from overfitting. In this paper, we propose a combination of multiple classifiers based on different convolutional models that complement each other and thus achieve an improvement in performance. We evaluate our approach on two public datasets: Food-101, a dataset with a wide variety of fine-grained dishes, and Food-11, a dataset of high-level food categories. On both, our approach outperforms the independent CNN models.
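    Combining classifiers can be sketched as late fusion over predicted class probabilities; the probability matrices below are random stand-ins for the outputs of the trained CNNs, and simple averaging is one of several possible fusion rules:

    ```python
    # Minimal sketch of late fusion: average the class-probability outputs of
    # several complementary models and predict the highest-probability class.
    import numpy as np

    rng = np.random.default_rng(0)
    n_images, n_classes = 5, 11               # e.g. the Food-11 categories

    def fake_model_probs():
        p = rng.random((n_images, n_classes))
        return p / p.sum(axis=1, keepdims=True)

    model_outputs = [fake_model_probs() for _ in range(3)]  # three CNNs
    fused = np.mean(model_outputs, axis=0)    # simple average of probabilities
    print(fused.argmax(axis=1))               # fused predictions
    ```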